SemanticScuttle - klotz.me » Tags: ai+production engineering

Tags: ai* + production engineering*


Why developers are spinning up AI behind your back — and how to detect it

The article discusses the rise of "Shadow AI": developers integrating LLMs into production without approval. It covers the risks involved and strategies organizations can use to manage the practice.

> We’ve seen LLMs used to auto-tag infrastructure, classify alerts, generate compliance doc stubs, and spin up internal search tools on top of knowledge bases. We’ve also seen them quietly embedded into CI/CD workflows...

2025-05-08 Tags: llm, ai, cyber security, devops, observability, production engineering by klotz

El Reg's essential guide to deploying LLMs in production

Running GenAI models is easy. Scaling them to thousands of users, not so much. This guide details avenues for scaling AI workloads from proofs of concept to production-ready deployments, covering API integration, on-prem deployment considerations, hardware requirements, and tools like vLLM and Nvidia NIMs.

2025-04-28 Tags: llm, ai, production engineering, inference engineering, deployment, vllm, nvidia, kubernetes, inference, api, scaling, gpu, machine learning by klotz

Server approved! 4xH100 (320gb vram). Looking for advice

A user is seeking advice on deploying a new server with 4x H100 GPUs (320GB VRAM) for on-premise AI workloads. They are considering a Kubernetes-based deployment with RKE2, the Nvidia GPU Operator, and tools like vLLM, llama.cpp, and LiteLLM. They are also exploring GPU passthrough with a hypervisor. The post details their current infrastructure and asks about potential gotchas and best practices.

2025-04-28 Tags: h100, kubernetes, vllm, llama.cpp, gpu, ai, deployment, rke2, litellm, quantization, sxm, fp8, awq, gguf, production engineering, inference engineering, scale, reddit, localllama by klotz

Arize Phoenix

Arize Phoenix is an open-source observability library for AI experimentation, evaluation, and troubleshooting, built by Arize AI.

2025-02-08 Tags: arize phoenix, ai, observability, experiments, evaluation, troubleshooting, visualization, opentelemetry, openinference, production engineering, data engineering by klotz

K8sGPT joins the CNCF Sandbox

K8sGPT is a tool for scanning Kubernetes clusters, diagnosing issues in simple English, and enriching data with AI. It helps with workload health analysis, security CVE review, and more.

2024-09-20 Tags: k8sgpt, kubernetes, diagnosis, ai, sre, production engineering, devops by klotz

How AI Solves the Kubernetes Complexity Conundrum - The New Stack

2019-08-16 Tags: ai, kubernetes, production engineering by klotz

Context: Apache Spark for Artificial Intelligence and AI 2.0 - The New Stack

2019-04-21 Tags: kubernetes, ai, spark, mit, production engineering by klotz

“Plant AI — Deploying Deep Learning Models”

2018-11-24 Tags: ai, deep learning, deployment, production engineering by klotz

Algorithmia - Deploy AI at Scale

2018-08-21 Tags: ai, machine learning, deployment, algorithmia, production engineering by klotz
